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Abstract 



Exploiting stochastic path integral theory, we obtain by simulation sub- 
stantial gains in efficiency for the computation of reaction rates in one- 
dimensional, bistable, overdamped stochastic systems. Using a well-defined 
measure of efficiency, we compare implementations of "Dynamic Importance 
Sampling" (DIMS) methods to unbiased simulation. The best DIMS algo- 
rithms are shown to increase efficiency by factors of approximately 20 for a 
5ksT barrier height and 300 for 9ksT, compared to unbiased simulation. The 
gains result from close emulation of natural (unbiased), instanton-like cross- 
ing events with artificially decreased waiting times between events that are 
corrected for in rate calculations. The artificial crossing events are generated 
using the closed-form solution to the most probable crossing event described 
by the Onsager-Machlup action. While the best biasing methods require the 
second derivative of the potential (resulting from the "Jacobian" term in the 
action, which is discussed at length), algorithms employing solely the first 
derivative do nearly as well. We discuss the importance of one-dimensional 
models to larger systems, and suggest extensions to higher-dimensional sys- 
tems. 
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FIG. 2. Time scales and crossing events. An unbiased trajectory exhibits two vastly different 
time scales: for a substantial barrier height, the waiting time between events greatly exceeds the 
time for a single crossing event, (a) The top plot shows an unbiased trajectory of an overdamped 
particle in the double well of Eq. ( |5.1| ), with a barrier height of Ef, = 7ksT. (b) The bottom plot 
isolates a single crossing event from the trajectory in (a) and highlights the rapidity (steepness) of 
the crossing, including of the ascent. Note that the unit of x in both plots is the lengthscale I; see 
Eq. flO]). 
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I. INTRODUCTION 



The rapid computation of the transition rate by numerical simulation for an overdamped 
stochastic particle confined to a one-dimensional double-well potential (Fig. |l|) is a decep- 
tively simple problem, and of crucial importance to making progress in mult i- dimensional 
systems. While straightforward unbiased simulation ultimately yields the rate to any de- 
sired precision (e.g., Q), it requires the simulator to endure many times the waiting period 
between the rare transition events (Fig. ^|a). Such unbiased simulation, moreover, is ut- 
terly impracticable for larger systems — particularly biomolecules possessing thousands of 
atoms, for which simulating a waiting time of 1 millisecond could require tens or hundreds 
of centuries of computer time. The well-known analytic methods for rate computations in 
simple systems |2| are also insufficient for high- dimensional, rough energy landscapes be- 
cause many barriers and metastable states of unknown geometry intercede along multiple 
unknown pathways between the two states of interest (see, e.g., @^]). 

The inadequacy of both analytic methods and straightforward simulation for large bio- 
chemical systems points directly to the need for importance sampling and related methods 
PHT8f . Importance sampling techniques focus computational effort on transition events, typ- 
ically generating an ensemble of transition trajectories from which to estimate rates and/or 
paths. Yet despite the early successes and formal appeal of these methods, a quantita- 
tively successful computational tool for large protein systems with 10 4 atoms has not been 
achieved. We believe the apparently trivial bistable one- dimensional system must be com- 
pletely understood in a simulation context before substantial progress can be expected for 
importance sampling in larger systems. 

The present paper is explicitly computational, or "simulational" : the sole objective is 
to develop efficient simulation methods for one-dimensional systems which are directly or 
indirectly applicable to large systems. We concern ourselves only with the "dynamic impor- 



tance sampling" (DIMS) methods developed by Woolf || and Zuckerman and Woolf [14 
which generate ensembles of fully independent transition trajectories. We hope and believe 
our results will be of practical use in other multi-dimensional methods, but such applica- 
tions are beyond the scope of this work. Rather, the "bottom-line" questions we attempt to 
answer are: (i) Within the DIMS framework, what are the most efficient methods for rate 
computation? (ii) Using a well-defined measure, how efficient are these methods compared 
to unbiased simulation? We will also discuss the maximum efficiency possible using DIMS 
and related methods in one-dimensional (ID) problems, as well as attempting to extrapolate 
to larger systems. 

Achieving efficiency for a ID problem using dynamic importance sampling, as we will 
show, requires the appreciation of a variety of theoretical results — particularly relating 
to stochastic path integrals. The review by Mortensen |l9j and Gardiner's book |2(J give 
introductions to stochastic integrals — of which Ito's and Strantonovich's are the most 
basic types. Historically, Onsager and Machlup first formulated a stochastic path integral 
to describe overdamped Brownian dynamics Since then, uniqueness and other formal 
aspects of the Onsager-Machlup formulation have been discussed by a variety of workers (e.g., 
pT|-p7|). Indeed, formal properties of stochastic functional integrals also play an important 
role in the present investigation. These integrals have been used, for example, to address the 
question of the most likely crossing event (see Figs. 0b, |28H38|. Related functional integral 
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approaches directly address the Smoluchowski (overdamped Fokker-Planck) equation for the 
evolution of the entire probability distribution |fnj|39| , ^f40l , |4Tl . 

Path integral formulations lead naturally to the notion of an average path, which is 
critical to the present discussion — even in one dimension. While the geometric pathway 
for barrier crossing is trivial in a one-dimensional potential like that of Fig. [I], the "speed" 
(average displacement) at every position constitutes another dimension of the average path 
- and a critical one to the present discussion. Such ID average paths (average speeds and 
distributions) were considered by Dykman, McClintock and coworkers and by Luchinsky 
and McClintock E|, both theoretically and experimentally. Zuckerman and Woolf |T1 



considered average paths in higher dimensions, computationally, while the problem of finding 
optimal reaction paths in multi-dimensional systems has a long history (e.g., p4|-^7|,|^,^1 . 

The paper is organized as follows. We first review the basic formalism for overdamped 
Langevin dynamics and importance sampling in Section |Tj. Section [HI] introduces pertinent 
path-integral results based on the Onsager-Machlup action, and reviews the so-called "Jaco- 
bian" term in some detail, as well as presenting new numerical data. Section [IV] discusses a 
new method for producing an efficient, biased ensemble of trajectories, while Section [V] gives 
the actual results from the new biasing technique. Section [Vl| addresses the relevance of our 
results to multidimensional problems. Finally, conclusions and a summary of the results are 
given in Section |V11| . 
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II. OVERDAMPED LANGEVIN DYNAMICS AND IMPORTANCE SAMPLING 



This section briefly reviews the fundamental stochastic dynamics equations and the dy- 
namic importance-sampling formalism for rate computations. We consider solely stochastic 
dynamics governed by the overdamped Langevin equation in the presence of Gaussian white 
noise. In the notation of [TJJ, such "Brownian dynamics" are described by 



dx/dt = f/mj + R(t), 



(2.1) 



where x is the configurational coordinate, t is time, f(x) = —W(x) is the force, m is the 
particle's "mass", 7 is the friction constant, and the noise R is assumed Gaussian with zero 
mean and variance given according to 



R(t) R(t') ) = {2k B T/ mi ) 6{t - f) 



(2.2) 



where kg is Boltzmann's constant and T the temperature. The equation (|2.1|) is presumed 
to be simulated according to the "Ito-like" (Euler) discretization |48,49[ 



x j+ i = xj + (fj/mj)At + Ax R , 



(2.3) 



where the subscripts j indicate quantities evaluated at time jAt = tj, so that fj = f(xj), 
and Axr is chosen from a Gaussian distribution of zero mean and variance 



a 2 = 2Atk B T/m-y . 



(2.4) 



The discretization Q2.3Q is considered to be Ito-like because the force — assumed to be 
constant over the interval At — is evaluated at the beginning of the interval; a loosely- 
termed "Stratonovich-like" approach would instead consider, formally [^,^,^1, 



fi - (fj + fj+i)/ 2 ■ (2-5) 

We note that discretizations more sophisticated than the Euler scheme ( |2.3| ) are well 
known — i.e., Runge-Kutta schemes of various orders ||50|-|52||. However, in explicit tests on 
rate calculations only for our simple system (Fig. P, the lowest-order Runge-Kutta (Heun) 
method permitted only a marginally larger time step — without losing accuracy — which 
did not justify the additional computational expense. Larger time steps tend to produce 
systematic errors, in addition to the easier-to-diagnose statistical error, regardless of the 
algorithm Rate calculations, moreover, may be especially sensitive. We also note that 
methods of higher-order than the Heun algorithm are not of interest here, because computing 
higher-order derivatives is extremely costly for the larger biomolecular systems toward which 
the present research is ultimately oriented. Thus, given our focus on rate computations and 
our interest in large systems, there seems little reason to employ anything more complicated 
than the Euler procedure (|2.3|). 

It is useful to consider, in parallel to the dynamical algorithm, the equivalent probabilistic 
description of the dynamics (|2.3|) . In particular, the explicit single-step transition probability 
density described above is given by 



T At (x j+1 \Xj) = (27T0- 5 



-1/2 



exp 



[{x j+x - Xj ) - {h/m^Abf /2<t< 



(2.6) 
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Thus the probability density for a whole trajectory ( = (xo, x%, X2, ■ ■ ■ , x n ) is given by the 
product of the single-step densities: 

n-1 

q(o = n^At^i+iN. (2.7) 

3=0 

Importance sampling is effected H|l4f by performing biased simulations which follow a 
prescription different from Q2.3| ). Naturally, trajectories generated from a biased simulation 
are distributed not according to Q, but according to some other function, D(Q. This 
alternative distribution can be used to compute probabilities and, hence, transition rates. 
One first requires the conditional probability P B (t\0;xo) to be in the final state ("B") at 
time t having started at xo at t = 0, which can be written in two equivalent ways for 
non- vanishing D: 

P B (t n \0;x ) ~ J d(Q{()h B {x n ) = J D(C)dC^rh B {x n ), (2.8) 

where d£ = Ilj=i dxj, and h B (x) is an indicator function which is unity for x in state B and 
zero otherwise. The approximation here is simply the neglect of the discretization error. In 
a biased simulation, the integral is estimated by 

P B (t n \Q;x )^± J2 §77*M*») (2-9) 

where Sp is a sampling ensemble of M trajectories chosen according to D, and where 
normalization, / d(D(Q = 1, is required. Note that the function D must be known for the 
simulation to be performed, just as (|2.6| ) and (|2.7j ) are the distributions from which unbiased 
steps and trajectories are sampled. Ultimately, rates are estimated by computing the slope 
of the linear regime in a probability vs. time plot, as discussed in Sec. [TV]. 



7 



§ 0.03 



I 



>T 0.02 J 

OP 



fi 001 

o 

5 o.oo 

> -0.8 
< 




Unbiased Crossing 
OM+Jacobian 



OM - No Jacobian 

OM+Const. - No Jacobian 



-0.4 



0.0 

x 



0.4 



0.8 



FIG. 3. Average simulation step sizes (xj—Xj-i) from many unbiased crossing events generated 
by Eq. (|2.3j). A typical crossing event is shown in Fig. 0(b). Solid circles are the data averaged from 
unbiased crossing events only for the potential of Eq. ( |5.1[ ), with a barrier height Eb = 7ksT. The 
solid line is the most probable path Eq. ( |3.6j ) based on the Onsager-Machlup action with the stochas- 
tic correction ("Jacobian") term; the dashed lines depict the uncorrected Onsager-Machlup ex- 
tremal path, Eq. (3.7) with C = 0; and, the dot-dash lines show the uncorrected Onsager-Machlup 
path with a non- vanishing constant — i.e., Eq. <K% with C = -2k B T U" (0) / (mj) 2 . The statisti- 



i.e., 

cal error in the simulation data may be gauged by the noise in the data. The data for \x\ > 0.6 are 
affected "artifactually" by the procedure of extracting crossing events from long trajectories; see 
the text, however. The data of this figure only were generated using the parameters At = 10 _4 7 _1 , 
7 _1 = 1, m = 10.98, and k B T = 249.462, largely following Refs. [8,14]; all lengths are given in 
units of the lengthscale I of Eq. (|5.1|). 
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III. PATH INTEGRALS AND MOST PROBABLE CROSSING EVENTS 



Achieving efficiency for a ID problem using dynamic importance sampling requires a 
variety of theoretical results for stochastic path integrals and their "instantons." These 
approaches have been used, for example, to address the question of the most probable 
crossing event (see Figs. 0b, |^), which can be derived in a straightforward way from the 
continuum Onsager-Machlup action used in the path integral formulation [p8 ,3C, 32-38 



In this section, we discuss the Onsager-Machlup action and its optimization at somewhat 
greater length than is required for the computational aims of this paper. This is because, 
beyond the theoretical interest in the action and especially in the so-called "Jacobian" term, 
some older literature deserves a fresh look and detailed discussion. 
The path-integral story begins with Onsager and Machlup 



and their well-known 

action for weighting continuum representations of overdamped Brownian trajectories. In 
that formulation, a continuous trajectory — presumably intended to represent a smoothed 
stochastic trajectory — is weighted according to 



probability oc exp 

S = 



s 



{k B T/m^f) 
dt C(t), 



where 



and 



Cou(t;x, x) 



x — 



rwy 



1 2 



(3.1) 
(3.2) 

(3.3) 



and where x = dx/dt and f(x) = —dU/dx. In their original paper [53|, Onsager and 
Machlup applied the Euler-Lagrange equation for functional minimization, 



dC 

dx 



d_dC 

dt dx 







(3.4) 



to the case of a single harmonic well. 

Subsequent contributions amended the original Onsager-Machlup formulation of the ac- 
tion, (|3.2|) and ( |3.3j ). First, Stratonovich realized in 1962 |21| that if one starts from the 
product of discrete-step probability densities (|2.7| ) and determines the continuum limit in a 
rigorous fashion, an additional term arises in the integral representation of the action Q3.2D , 
so that instead of (3^), the effective Lagrangian becomes 



C 



OMJ 



Oh T 

Com + U "( x ) ■ 



(3.5) 



This result was confirmed by Bach et al. [^||. In other words, the original description by 
Onsager and Machlup was incomplete for simulations performed according to the Ito-like 
algorithm ( |2.3j ). Note that this new term is proportional to the curvature of the potential 
and is dominant at a barrier top, for any non-zero noise amplitude (T > 0). 

In the mid 1970s Graham |39| , [28|] re-derived the term as a Jacobian resulting from chang- 
ing variables from the fluctuation coordinate to the configuration coordinate — assuming a 
Stratonovich-like discretization (|2.5|). Thereafter, apparently, the term has been viewed as a 
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Jacobian, although the propriety of that name in a practical simulation context is question- 
able since, to the authors' knowledge, only the Ito discretization (|2.3| ) is usable precisely, 
and that results in a constant Jacobian. More appropriate terminology might be "stochastic 
correction term." 

Both Stratonovich and Graham indicated, importantly, that the inclusion of the 
stochastic-correction/ "Jacobian" term led to a path-integral representation of the proba- 
bility distribution consistent with the overdamped Fokker-Planck (Smoluchowski) equation. 

Graham also derived the most probable crossing event associated with the corrected 
action using the Euler-Lagrange equation (^.4[), namely P5[ , 



x 



OMJ 



2k B T 
(777.7) 2 



U"[ 



x) 



c, 



where C is a constant of integration. This result is called an "instanton" 
(e.g., p3j) because it describes a rapid transition, as illustrated in Fig. 



(3.6) 

by field theorists 
||. To appreciate 

OMJ 



this, note that by taking either the positive or negative square root for in ( p.6|) , the 

velocity of ascent is seen to be the same as for descent! Yet this feature was apparently not 
appreciated until later; see below. It will, however, prove central to our goal of obtaining 
efficiency in simulations. Below, we also address the computational cost associated with 
computing a second derivative of the potential. 

It is worth noting that although the extremal path, formally, results when C < in ( fj.6| ) 
[from minimizing (|3.2|) with (3.5)1, one cannot necessarily take the constant to be negative 
or even to vanish. Observe that the left-hand side of ( |3.6| ) must be positive. Thus, in 
regions of relatively large positive curvature — such as near a well bottom — the second 
term of the right-hand side could exceed the first in magnitude, requiring C > 0. While this 
observation appears to be immaterial to our investigation of the simple bistable well (see 
Fig. |H), one could imagine the positivity of the constant having physical consequences in a 
more complicated potential with metastable intermediate states. 

Other features of the most probable crossing have also proved of great interest. Dykman 
and Krivoglaz made important observations in their study of transitions in nonequilibrium 
systems using an uncorrected Onsager-Machlup functional integral, in 1979 |[29|| . In a series of 
papers begun in 1989 and aimed primarily at systems with colored noise, Bray, McKane and 
coworkers []3Q| , |32| p5f| also provided some insights pertinent to the case of white noise without 
the correction term. (While the latter group included the Jacobian term formally, it was 
omitted from their analysis of the low-temperature limit.) Both groups derived essentially 
a special case of Graham's result (p.6|) 



x 



OM 



m 
7777 



+ c, 



(3.7) 



where C is again a constant of integration which vanishes for the most probable case. It 
also is equivalent to Graham's result at inflection points of the potential. Bray, McKane and 
coworkers noted the crucial fact mentioned above: for systems with detailed balance (i.e., 
time-invariant potentials), the ascent described by x® M is equally rapid as, and symmetric 
with, the descent (see Fig. ^b). This point is also implicit in the work of Dykman and 
Krivoglaz. Furthermore, both groups recognized that paradoxically the extremal trajectory 
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(C = 0) never occurs [0,0 because an infinite amount of time is required to reach (or 
descend from) a parabolic barrier top! This may be seen by computing the barrier crossing 
time, 

/•*/ , m dx 
h= dt = _ . (3.8) 

Such apparently formal observations belie the fact that optimal paths have proven useful 
in a variety of recent studies. Dykman et al. determined weighted distributions of paths 
(including the optimal) leading to large, non-transitioning fluctuations in a time-oscillatory 



potential ||55|| . Olender and Elber [Q discussed the Onsager-Machlup extremal barrier 



crossing, and employed a discrete version in developing an algorithm for steepest-descent 
path finding between known initial and final states. Bier et al. [37] discussed extremal 
crossings in the context of rate calculations for fluctuating barriers. Elber and Shalloway 
suggested a temperature dependence for the integration constant of ( |3.7| ) to overcome the 
divergent-crossing time difficulty, and successfully distinguished optimal paths for different 
regimes of an effective temperature parameter |38] . 



Both in order to confirm the analytic results reviewed above and to gather data that could 
usefully inform our efforts to gain efficiency in rate computations, we studied numerically 
generated crossing events. We performed very long unbiased simulations of a particle moving 
according to ( |2.3| ) in a one-dimensional bistable well (Fig. [I]) and "snipped out" a large 
number of crossing events like the one shown in Fig. |2|b. Binning the observed step sizes of 
only the crossing events according to the x position, Fig. [| shows that the average step sizes 
(xj — Xj-i) cross closely follow the prediction ( |3.6| ) which includes the stochastic correction 
term. The two theoretical predictions lacking the correction term — namely (|3.7| ) with two 
values for C — appear far less adequate by contrast. We note that the binned distributions 
appear to be highly Gaussian so that the average and most probable values essentially 
coincide. 

A comment on the data for \x\ > 0.6 in Fig. |3| is in order. The substantial turn-up of the 
data for x > 0.6 is certainly an artifact of isolating crossing-events: by definition, the final 
step or steps are right-moving. Thus, if one defines crossing events to end when the value 0.7 
is exceeded, rather than 0.8 as in the figure, the turn-up occurs for correspondingly smaller x 
values. Interestingly, while the same argument applies to the left side of the figure, the data 
are not similarly affected. Indeed, the asymmetry between the left and right edges of the 
plot is striking. While, at present, we can offer no convincing explanation, it is worth noting 
that the "OM+Jacobian" description becomes less than credible as x increases beyond the 
right inflection point of the potential, x ~ 0.58. That description, ( |3.6| ) with C = 0, suggests 
counter-intuitively that a particle which just completed a transition should fall more slowly 
than the drift velocity (effectively given by "OM - No Jacobian" for x > in Fig. |3p. Yet 
it is not clear to the present authors precisely why the applicability of the corrected OM 
theory should be limited to the region between the inflection points; intuitively, of course, the 
inflection points roughly mark the boundaries between the stable states and the barrier-top 
transition rate. 
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10 5 DIMS Trajectories 
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FIG. 4. Evolution of end-state arrival probability for rate calculations. The probability to 
arrive in the right well of the potential Eq. Q5.1|) with Eb = 9/cbT, depicted in Fig. |], is plotted 
against time, based on trajectories initiated at the bottom of the left well (x = —1) at time t = 0. 
Both unbiased (top) and DIMS (bottom) results are shown, with the latter using Eqs. (4.1) and 
(|3.6| ). The rate, k, is computed as a fit to the slope of the linear regime. The DIMS computation 
(bottom) shows a dramatic improvement in efficiency, which is quantified in Fig. ||. 
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IV. A HIGHLY EFFICIENT BIAS METHOD FOR DIMS CALCULATIONS 



The primary goal of the present work is to improve and quantify the level of efficiency 
in one-dimensional rate calculations. To that end we now introduce and evaluate a new 



biasing method, which is a variation on those discussed in our earlier work P,|l4|. The 
method combines two essential ingredients: biased crossing events which emulate the most 
probable path, together with a threshold at which the biasing is triggered. 



A. Motivation 

Computing reaction rates by simulation requires the generation of an ensemble of ap- 
propriately weighted trajectories (or a correspondingly long trajectory) exhibiting many 
barrier- crossing events. While many sampling methods are capable of generating a suitable 
ensemble of trajectories in a long period of time, it is no easy task to rapidly generate a truly 
important ensemble — containing highly probable crossing events — even for an apparently 
trivial potential like a symmetric, one-dimensional double well. The reason is not hard to 
see. In an unbiased simulation ( |2.3j ), each step is chosen from a Gaussian distribution cen- 
tered at the deterministic step in the direction of the force. For crossing events which climb 
over a barrier, a trajectory necessarily takes steps in a direction opposing the force. As the 
center of the Gaussian distribution is always downhill from the present step, the typical step 
in the ascent of a crossing event is in the "tail" of the Gaussian distribution (more or less 
so depending on the parameters entering into the width, a). As the full probability for a 
crossing event is then a product of steps in the tail of the distribution, it seems clear that 
only small fluctuations about the most probable crossing event will occur. Large fluctuations 
away from the most probable path will be exponentially damped out, as they require steps 
yet further out in the tail of the distribution. 

Our interest in the most probable path for a crossing event, then, is hardly surprising: 
to generate an "important" (albeit biased) ensemble of trajectories, one must hew to the 



most probable path. This we shall do explicitly below, using the results quoted in Sec. fu . 

But once one knows the most probable path for a single crossing event, the DIMS method 
still requires the generation of such events at appropriate intervals. That is, there is an ideal 
distribution of waiting times between events in a biased simulation (cf. Fig. §a for the 
unbiased case). To see the reason why a distribution of wait times (first passage times) is 
required, consider Fig. f|, which demonstrates the rate calculation for biased and unbiased 
simulation. One first calculates the probability to arrive in the final state (the right well in 
Fig. [l|), having started in the initial state (the left well), as a function of time. One then 
determines the slope of the linear regime. While an unbiased simulation of sufficient length 
will automatically generate data with roughly the same precision for all times shown along 
the horizontal axis in Fig. f|, a biased DIMS simulation must be designed to do so. As 



discussed in [II]], one essentially has to evaluate a separate integral for each time point, so 
part of the goal is to spread the information gathered evenly over the necessary times. This 
is the motivation behind the "thresholding" detailed below. 
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B. The Algorithm: Most Probable Crossings above a Threshold 



Specifically, trajectories are generated according to the following algorithm. Each tra- 
jectory is started from the minimum of the left well, x — — 1, of the potential ( |5.1| ) shown in 
Fig. |l|, and run for a fixed total amount of time exceeding the transient time, Pre-defining 
a threshold value, — 1 < Xh < 0, of the coordinate x, we perform unbiased simulation while 
x < Xh according to (|2.3| ). If and when the threshold value is exceeded (x > x^), we select 
steps according to 

Xi+i = %i + x c (xi) At + Ax R , (4.1) 



with x c given by either the Jacobian-augmented result ( p.6[ ) or that without ( fj.7[ ), and with 



Axr chosen from the same Gaussian distribution as in the unbiased case. In other words, 
instead of using the deterministic step fiAt/mj as in (|2.3| ), we use the most probable step 
in a crossing event to emulate unbiased crossing events. The use of a threshold away from 
the well minimum ensures a sufficiently broad distribution of waiting times between the 
artificial crossing events, which in turn permits the acquisition of data for the range of times 
necessary to compute the rate as in Fig. |j. We note, however, that it is no more difficult to 
run a single long trajectory and compute correlation functions to determine rates (e.g., PJ). 

If a Heun scheme [^0|-|52| were used for the unbiased dynamics, the biased dynamics just 
given ( [4.1| ) could be readily modified in the same way the Euler scheme itself ( |2.3| ) is modified 
in the Heun approach. Given that the underlying continuum action should be the same for 



the two integration methods, a Heun modification of (f4.1|) — or even the unmodified version 



should give similar results to those found here with the Euler method. 



C. Curvature- Adjusted Sampling Width 

Yet another refinement for the one- dimensional biasing techniques is possible, motivated 
by the time-dependent width of the Gaussian noise, cij, required for the provably optimal 
computation of the probability density at a single point on a curvature-free potential surface 
||49|| . Note that the biasing methods just described always used a constant width, o~j = a 
given in fl2.4Q . In a spirit similar to Wagner's approach described in ||49| , one can ask the 



question: 'What is the optimal sampling scheme to travel between two fixed points, with 
one intermediate step, on a surface with arbitrary curvature?" Answering this question is 
not difficult and suggests that the local curvature influences the optimal width. 

Following Wagner (see J49|), one needs to observe first that the optimal sampling density 
at time t = At is exactly the distribution of unbiased two-step paths which begin at xq at 
t = and end at x^ at t = 2 At. This constitutes the perfect sampling distribution because 
it is precisely that (tiny) subset of unbiased trajectories which end at the pre-defined value 
of interest, and which are distributed naturally. Using the single-step Gaussian distributions 
(|2.6D, the two-step distribution is 



TaHx 2 |£i)T A4 (xi|xo) = ^: ex P \ ^ 
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where a still represents the unbiased value and fa again gives the force at X{. By rearranging 
the terms, completing the square and approximating f\ ~ f — (xj — x ) U"(x ), one finds 



X0+X2 ! 1—a 



TAt(x2\xi)T At (xi\x Q ) ~ c x exp 



2 \l-a+a 2 /2 



(4.3) 



2 [ff/^/2(l - a + a 2 /2) 



where c is a constant independent of X\ in this approximation, and the dimensionless cur- 
vature is a = U"(xo)At/rri'y. 

The result ( |4.3| ) for the optimal distribution of X\ values illustrates several interesting 
points. First, the distribution is independent of the force to first order in a. Second, while 
the correction to the expected mean of the distribution, (xo + ^2)/2, is second order in a, 
that for the width, a, is first order — noting that the factor \/2 is expected from the work 
of Wagner and actually approaches unity for a large number of time steps |4!| . The optimal 
sampling width in fact decreases at the barrier top, where a < 0, in the small At limit 
(a ^> a 2 ). Thus, the distribution (|4.3| ) motivates a further bias for sampling, namely use of 
the width 



for sampling from Gaussian distributions near the point x, where a is to be computed using 
U"(x). We investigate this refinement in the next section. 




(4.4) 
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FIG. 5. Efficiency in Rate Computations using the DIMS approach. The standard deviation, 
<7fc, of a set of 20 rate estimates computed by either unbiased or DIMS simulation is plotted 
against the length of each simulation (top 2 plots). The spans of the horizontal arrows measure 
the efficiency by indicating the differences in simulation length required to obtain a given variance 
(i.e., precision). For example, when the barrier height is Eb = 9ksT, the best DIMS approaches 
are roughly 300 times faster in obtaining a precision given by = 10 -4 . The "DIMS- Wagner" 
algorithm is from our earlier work [14], while the other DIMS simulations use the algorithm of 
Eq. ( |4.1[ ) and the surrounding text. For the latter, the most probable step is chosen according 
to either the Jacobian-augmented formulation ( |3.6| ) or that without (|3.7| ), as indicated. The label 
"CURV" indicates that the Gaussian sampling width was modified according to (4.4). The bottom 
plot shows rate estimates, k, for the 9kgT barrier height. Both DIMS [Eqs. ( |4.1| ) and ( |3.6| )1 and 
unbiased estimates converge toward a common result with increasing simulation length. The error 
bars indicate the standard error of the mean, and under-estimate the 95% confidence interval. 
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V. RESULTS: COMPARISON OF EFFICIENCY 



We demonstrate the capability of the DIMS algorithms of the previous section by quan- 
tifying their efficiency for rate computations in the simple bistable potential shown in Fig. 

B 

U(x) = E b ((x/l) 2 -lf , (5.1) 

where Eb is the barrier height and I is the lengthscale of the problem. The central result 
is that one must account for the most probable crossing to gain maximum speed-up in rate 
computation as compared with unbiased simulation. Not surprisingly, the efficiency increases 
with barrier height. Yet even for the relatively low barrier height of Eb = 5k #T, we achieve 
roughly a 20-times efficiency improvement — i.e., DIMS rate calculations are approximately 
20 times as fast for a given level of precision. That factor increases to 300 for a 9k bT barrier. 
While such gains will not be readily extendible to multi-dimensional systems, it is important 
to understand and demonstrate the ingredients necessary for optimal performance. 

Fig. H shows our results for the potential ( |5.1| ) for two different barrier heights, Eb/ksT = 
5 and 9. The biasing methods accounting for the most probable crossing are significantly 
superior for the larger barrier. The "DIMS- Wagner" algorithm — which takes no account 



of the most probable path — refers to the technique described in Ref . [HJ] . It performs only 
modestly well for the 9k bT barrier, and its efficiency is very sensitive to the fixed simulation 
length. All the other DIMS procedures employed the algorithm of the previous section, 
Q4.1D , above a threshold Xh = —0.7, with trajectories initiated at x — —1. The presence or 
absence of the "Jacobian" (stochastic correction term) reflects whether ( |3.6| ) or ( |3.7| ) was 



used to complete ( |4.1|) , and "CURV" indicates that the curvature- modified width (|4.4| ) was 



used in place of the unbiased width (|2.4|). Rates, k, were calculated using the method noted 
in Sec. [IV A| and Fig. |j. Efficiency is computed by estimating the relative simulation lengths 
needed to obtain a desired degree of precision, given by the variance of the rate estimates, 

1 n 

** = -£fo-<*» 2 . (5-2) 

i=l 

where n = 20 is the number of simulations performed for each data point, ki is the rate 
computed for the ith simulation, and (k) is the mean of the n rate estimates. The horizontal 
arrow spanning the DIMS and unbiased results for Ef, = 9ksT at a precision of = 10 -4 , 
for example, exceeds two decades — indicating that the DIMS computations are more than 
100 times faster. 

The effectiveness of the DIMS formulation excluding the Jacobian — i.e., based on (|4.1|) 
and ( p.7|) with C = — deserves further comment. While the Jacobian- augmented DIMS 
simulation is slightly superior for Eb = 9ksT, the insensitivity to including the correction 
term is surprising given the sharp contrast demonstrated in Fig. || The lesson appears to be 
that, at least for the parameters studied, the motion in the immediate neighborhood of the 
barrier top (where the correction term, proportional to the curvature, has greatest effect) is 
less important to the anatomy of a crossing event than the (rapid) climb and descent. In the 
long term, the success of the uncorrected approach could facilitate the extension of DIMS 
to multi-dimensional systems, since that approach does not require computation of second 
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derivatives of the potential. Future work may show this to save a substantial amount of 
computer time. 

We note that our efficiency estimates have excluded the "overhead" cost of implementing 
the DIMS method. This cost depends on the optimization of one's code, and as it happens, 
our code is sub-optimal for unbiased simulation, so that there is no overhead at all. There 
are, however, inherent overhead costs in DIMS that cannot be optimized away. While 
the dynamics employing the Jacobian-augmented most probable path ( |3.6| ) requires the 
computation of a second derivative at every step, our results show that the simpler form 
( fj.7p is nearly as good and requires only the force. Calculating the force, it should be 
remembered, is not an overhead cost because this must be done in unbiased simulation 
Q2.3|) anyway. The only notable cost inherent in the DIMS method, then, is computing the 
error associated with biased computation — in order to correct for it as discussed in Refs. 



fljbi . This correction entails computing the ratio of two Gaussian terms (or, equivalently, 
the difference of two logarithms) at every step. Compared with the fixed cost of unbiased 
simulation — computing the force and generating a high quality pseudo-random number at 
every step — and its inherent inefficiency with long waiting times, the DIMS costs are far 
overshadowed by the efficiency gains which here exceed one and two orders of magnitude. 

For completeness, we give a number of further details, which apply to both the unbiased 
and DIMS results. A simulation consisted of N trajectories (see Fig. |5|) initiated at x — — 1 
at time t = 0. The time steps were At = 0.0037~ 1 for = bksT (having set 7=1) 
and At = 0.0017 -1 for Eb = 9/cbT. These were determined to be close to the maximum 
values for which the rate estimates did not change as At increased in unbiased simulation. 
As discussed above, the rate is computed as the slope of the linear regime in a plot of the 
arrival probability (to be in the right well, x > 0) as a function of time; see Fig. The 
slopes (rates, ki) were computed from a least-squares fit to 10 data points, the t values of 
which were held fixed for a given barrier height. The particle mass, m, and the thermal 
energy, ksT, were both set to 1. Note that Fig. uses the parameters given in the caption. 



A. What is the Optimal Efficiency? 

"The system is so simple. How does one know whether a 300-time improvement in 
efficiency is impressive?" So a skeptic might wonder, and the attempt to answer seems a 
worthwhile exercise. 

The basic point is that computing a rate by simulation involves the simultaneous calcu- 
lation of a series of difficult integrals of the form ( |2.8| ) discussed in Sec. |T[ Each data point 
in Fig. [| is an estimate for such an integral. 

We can try to estimate the minimum number of trajectories needed for the rate com- 
putation by multiplying together estimates for the following: (i) the number of trajectories 
needed to estimate the probability density to be at a single location, x, in state B at a given 
time, P{tj] x G B\0; A); (ii) the number of discrete locations x in state B required to estimate 
the probability for the whole state; and, (iii) the number of independent time points needed 
to estimate a slope in the linear regime. Regarding (i), only in the case of a constant force 
can P(tj] x <E B\0; A) be computed exactly — equivalently, with a single trajectory. One 



might expect that at least 10 trajectories would be required for any real surface, setting (i). 
In a similar manner for (ii), at least 10 points should be required to characterize a state 
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(which, in principle, is known only numerically). The number of independent time points 
is a slightly more complex issue since some (though not all) trajectories from a given time 
point may also be used to estimate another. Conservatively, then, we use the estimate three 
for (iii). Our estimate for the minimum number of trajectories required to calculate a rate 
is thus 300. We believe our DIMS results of Fig. ^] compare favorably with this heuristic - 
and conservative — theoretical minimum. 
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VI. POSSIBLE EXTENSIONS TO MULTI-DIMENSIONAL PROBLEMS 



While the results presented here for a one-dimensional potential seem a far cry from 
a high-dimensional biomolecular system, we believe they teach important lessons for the 
large system. The simplest point is that a biased trajectory must closely mimic the natural 
barrier-crossing dynamics if efficiency in rate computations is to be obtained. Indeed, we 
have found that a poor bias can be worse than no bias at all. It is not enough to know 
- as one does automatically in one dimension — even the optimal geometric channel for a 
transition: the size of the steps along that geometric path are critical, as we have shown. 

Although the problem of finding channels is extremely challenging in itself, let us ask 
"How can one compute the rate for a large system assuming the channels are known?" A 
natural choice would be to start with an initial trajectory within each channel, and then 
attempt either to optimize it |56| , ^5| -^,|6|, p6| , p7| , p8| or to generate an ensemble of trajectories 
from it |f7|JT3[]. Given the importance of closely following the optimal course, it seems natural 
to use an optimization or sampling scheme which builds in the knowledge of the most 
probable path (|3.7|) and its mult i- dimensional analog. The risks, otherwise, could be great: 
since a high-dimensional path will be very rough, it is easy to imagine a multi-step segment 
of a trajectory becoming trapped in a region of the potential surface with far too few or 
too many steps to be even close to optimal. One idea for overcoming this difficulty would 
be to use a scheme capable of removing and inserting time steps, in order to search for an 
appropriate distribution of steps along a pre-defined geometric path. We intend to pursue 
further investigations along these lines. 

Returning to the issue of finding multi-dimensional channels in the first place, we note 
that the DIMS method is ideally suited to attack this problem since it generates an en- 
semble of completely independent trajectories. Indeed, we have already developed an al- 
gorithm which has proved capable of efficiently finding distinct, important channels in a 
multi-dimensional system [f)9[| . We have named the idea the "soft-ratcheting algorithm," 



and we note that its efficiency is thus far limited to finding channels, rather than determin- 
ing rates. The essence of the technique is simple: generate an unbiased step and accept 
it with a probability (hence the "softness") depending on how far the trajectory has pro- 
gressed toward the target state. To complete the calculation in the DIMS formulation, one 
then estimates the overall acceptance probability — which is an inexpensive calculation in a 
large system. Note that the soft-ratcheting algorithm requires no second derivatives of the 
potential. 

Finally, a timescale problem could prove serious, even though we do not expect it to be 
nearly the handicap it is for molecular dynamics. In particular, a fundamental limitation 
of applying the DIMS method (or a related approach p|-|To]]) to multi-dimensional problems 
is the barrier-crossing time, In practical terms, tb shows up as the transient time prior 
to the linear regime in a plot used for rate evaluation (Fig. ^). Probability cannot accrue, 
after all, until crossing events have occurred. In a large system, tb is the limiting timescale 
for applying a method like DIMS to computing rates. Since a reasonable number, say N, 
crossing events will be needed to estimate the rate, one would have to simulate for a time 
exceeding Ntb- The authors are unaware of any estimates of tb for biomolecular systems, 
but we note that — for the DIMS method to potentially yield a rate estimate — U would 
have to be less than a nanosecond for a large, explicitly-solvated protein. Recent work on 
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large time steps p0|,57| , ^8 , |15 | holds promise, however, for attacking the timescale problem. 
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VII. SUMMARY AND CONCLUSIONS 



We have demonstrated substantial increases in efficiency for simulation-based calcula- 
tions of transition rates in bistable potentials with modest barrier heights, 5ksT and 9/cbT, 
using new biasing methods within the dynamic importance sampling (DIMS) formulation 
,nj. Computations were sped up by a factor of approximately 300 for the 9&bT barrier, 



and the primary results (Fig. [|) suggest the speed-up will increase significantly for larger 
barriers. The critical ingredient in our efficiency was close emulation of probable crossing 
events suggested by the Onsager-Machlup action, (|3.2j ) and (|3.3| ). 

The simple one-dimensional problem has been addressed from a variety of theoretical and 
numerical perspectives in an effort to pave the way for more difficult problems. In Sec. |II| we 
examined the stochastic correction — or "Jacobian" — term ( |3.5| ) in the Onsager-Machlup 
action from theoretical and numerical perspectives. There, we also discussed the desired 
distribution of waiting times between artificial crossing events, as well as the effect (and 
utility) of the curvature of the potential. After presenting the explicit results for efficiency 
levels in rate computations, we discussed the optimal efficiency one could hope to attain in 
principle. 

In commenting on the extension of the DIMS method to large, high- dimensional systems 
in Sec. |VI|, we noted that the problem may be conceptually broken up into two parts: finding 
the geometric channels, and then sampling trajectories within those channels. The DIMS 
method is ideally suited for the first step, channel-finding, and we described an explicit 
algorithm effective in that capacity. Our hope is that the results of the present paper will be 
useful in constructing techniques for the second stage, single-channel trajectory sampling. 
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